Ligation-assisted homologous recombination enables precise genome editing by deploying both MMEJ and HDR

Abstract
CRISPR/Cas12a is a single effector nuclease that, like CRISPR/Cas9, has been harnessed for genome editing based on its ability to generate targeted DNA double strand breaks (DSBs). Unlike the blunt-ended DSB generated by Cas9, Cas12a generates a sticky-ended DSB that could potentially aid precise genome editing, but this unique feature has thus far been underutilized. In the current study, we found that a short double-stranded DNA (dsDNA) repair template containing a sticky end that matched one of the Cas12a-generated DSB ends and a homologous arm sharing homology with the genomic region adjacent to the other end of the DSB enabled precise repair of the DSB and introduced a desired nucleotide substitution. We termed this strategy ‘Ligation-Assisted Homologous Recombination’ (LAHR). Compared to single-stranded oligodeoxyribonucleotide (ssODN)-mediated homology-directed repair (HDR), LAHR yields relatively high editing efficiency, as demonstrated for both a reporter gene and endogenous genes. We found that both HDR and microhomology-mediated end joining (MMEJ) mechanisms are involved in the LAHR process. Our LAHR genome editing strategy extends the repertoire of genome editing technologies and provides a broader understanding of the type and role of DNA repair mechanisms involved in genome editing.
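As an illustration of the sticky-end matching that the template design relies on, the following minimal sketch derives the overhangs left by a staggered Cas12a cut. It assumes cleavage after the 18th base of the non-target strand and after the 23rd base of the target strand (a commonly reported AsCas12a pattern; actual cut positions can vary). The function names and the example sequence are hypothetical and are not taken from this study.

```python
# Sketch only: cut positions are assumptions, not values reported in this study.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def cas12a_overhangs(protospacer, nontarget_cut=18, target_cut=23):
    """Return the two 5' overhangs left by a staggered cut.

    protospacer: non-target (top) strand, 5'->3', starting right after the PAM.
    nontarget_cut / target_cut: number of protospacer bases on the PAM side
    of the cut on each strand (assumed defaults: 18 and 23).
    """
    if not 0 < nontarget_cut < target_cut <= len(protospacer):
        raise ValueError("cut positions must satisfy 0 < nontarget < target <= length")
    stagger = protospacer[nontarget_cut:target_cut]
    return {
        # PAM-distal end: single-stranded 5' overhang on the top strand
        "pam_distal_top": stagger,
        # PAM-proximal end: single-stranded 5' overhang on the bottom strand
        "pam_proximal_bottom": revcomp(stagger),
    }


example = cas12a_overhangs("ACGTACGTACGTACGTACGTACGTA")  # hypothetical sequence
print(example)
```

The two overhangs are reverse complements of each other, which is why a template end carrying one of them can anneal to the opposite genomic end.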


Supplementary figure 2 (A) Verifying CAPR by Sanger sequencing
To confirm the repair of EGFP ∆fluor reporter cells by CAPR (including 'single-cut' controls), EGFP-positive cells were FACS-sorted both in bulk and as single cells. Genomic DNA was isolated from the bulk EGFP-positive cells and from 20 EGFP-positive clones. The target locus was then amplified from the genomic DNA by PCR, and the PCR products were analyzed by Sanger sequencing.
Chromatograms, from top to bottom, show the unrepaired fluorophore-removal locus of the EGFP ∆fluor reporter gene, the EGFP fluorophore restored by the homologous repair insert with dual or single AsCas12a cleavage, and the repair using a non-homologous repair insert. In the chromatograms showing the repaired EGFP ∆fluor reporter genes, the recovered EGFP fluorophore coding region is shaded in green. Because none of the sequencing data showed mutations, we present only one chromatogram from each group.

(B) Indel efficiencies of AsCas12a-created 'left cut' and 'right cut' on the EGFP ∆fluor reporter gene
To assess the indel efficiencies of the two AsCas12a cleavage sites, 'left cut' and 'right cut', on the EGFP ∆fluor reporter gene (the two sites used in the CAPR studies; Figure 1), AsCas12a protein and crRNA targeting each site were transduced separately into the single-copy EGFP ∆fluor reporter cells.
Following genomic DNA isolation, the PCR-amplified target locus was subjected to a T7E1 assay.
The schematic displays the predicted cleavage patterns of uncut, left-cut and right-cut PCR products after T7 endonuclease treatment. The agarose gel shows the fragments resulting from T7 endonuclease treatment. Band sizes are indicated at the right side of the gel image. The indel efficiencies were quantified using ImageJ analysis of the gel image.
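The legend does not state how band intensities were converted into indel efficiencies. A minimal sketch of one widely used estimate for T7E1 gels, indel % = 100 × (1 − √(1 − fraction cleaved)), is given below. The function name and the example intensities are hypothetical, and this formula is an assumption rather than the authors' stated method.

```python
import math


def t7e1_indel_percent(uncut, cleaved):
    """Estimate indel % from background-subtracted band intensities.

    uncut:   intensity of the full-length (uncleaved) PCR band
    cleaved: iterable of intensities of the T7E1 cleavage bands
    Assumed formula: 100 * (1 - sqrt(1 - fraction_cleaved)).
    """
    cleaved_total = sum(cleaved)
    total = uncut + cleaved_total
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = cleaved_total / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical ImageJ intensities for one lane (arbitrary units)
print(round(t7e1_indel_percent(64.0, [18.0, 18.0]), 1))
```

The square root corrects for the fact that T7 endonuclease cleaves heteroduplexes, so the cleaved fraction is not equal to the fraction of mutated alleles.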

Supplementary figure 3 Optimizing quantities and ratios of LAHR components
(A) To examine how AsCas12a protein concentration influenced the LAHR efficiency, we tested an AsCas12a protein gradient using the EGFP Y66S mutant reporter cells. LAHR efficiency plateaued at a final concentration of 15 µM AsCas12a. Error bars correspond to the standard deviation of the average of n = 3 parallel samples. The experiment was repeated three times and a representative dataset is presented here.
(B) LAHR was performed to repair the EGFP Y66S mutation using different molar ratios between AsCas12a protein and crRNA. The experiment demonstrates that 1:4 is the optimal ratio. Error bars correspond to the standard deviation of the average of n = 3 parallel samples. The experiment was repeated three times and a representative dataset is presented here.
(C) The repair efficiency correlated positively with the amount of LAHR template. Error bars correspond to the standard deviation of the average of n = 3 parallel samples. The experiment was repeated three times and a representative dataset is presented here.

Supplementary figure 4 Cell viability assessment following iTOP transduction of LAHR components
The post-iTOP viabilities of different cell lines were assessed by MTS assay, performed 24 hours after iTOP transduction. The bar graphs show the viabilities of different cell lines after iTOP delivery of 80-bp LAHR or 100-nt ssODN templates in different quantities, together with AsCas12a protein and crRNA. We observed no significant difference in cell viability between the empty-iTOP or no-template-DNA controls and any of the tested template samples. Error bars correspond to the standard deviation of the average of n = 3 parallel samples. The experiment was repeated three times and a representative dataset is presented here. Statistical test: two-tailed unpaired t-test, ns P > 0.05.
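The comparison described here (mean ± standard deviation of n = 3 parallel samples, two-tailed unpaired t-test) can be sketched as follows. The sketch assumes equal variances (Student's t-test), which the legend does not specify, and uses the closed-form t distribution for 4 degrees of freedom, so it is valid only for three samples per group. The function names and viability values are hypothetical.

```python
import math
from statistics import mean, stdev


def unpaired_t_test_n3(a, b):
    """Two-tailed Student's t-test for two groups of exactly 3 values (df = 4).

    Returns (t, p). Assumes equal variances; this is an assumption of the sketch.
    """
    if len(a) != 3 or len(b) != 3:
        raise ValueError("the closed-form p-value is valid only for n = 3 per group")
    pooled_var = (stdev(a) ** 2 + stdev(b) ** 2) / 2  # equal n: plain average
    if pooled_var == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * 2 / 3)
    # Closed-form CDF of Student's t distribution with 4 degrees of freedom
    x = abs(t)
    w = 1 + x * x / 4
    cdf = 0.5 + (3 / 8) * (x / math.sqrt(w)) * (1 - (x * x / w) / 12)
    return t, 2 * (1 - cdf)


def significance_label(p):
    """Map a p-value to the star notation used in the legends."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


# Hypothetical viability readings (% of untreated), n = 3 parallel samples each
control = [98.0, 101.0, 100.0]
template = [97.0, 99.0, 102.0]
t_stat, p_value = unpaired_t_test_n3(control, template)
print(f"mean ± SD: {mean(control):.1f} ± {stdev(control):.1f}")
print(significance_label(p_value))
```

For group sizes other than three, a statistics library would be the more practical choice than this closed form.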

Supplementary figure 5 Supplementary comparisons for Figure 3A
Here, we examined HDR efficiency using reverse complementary ssODN templates (ssODN1 and ssODN2) in the EGFP Y66S reporter cell line. Since the AsCas12a PAM (green arrow) and SpCas9 PAM (orange arrow) had the same orientation, for both nucleases the top strand was the 'non-target strand' and the bottom strand was the 'target strand'. The ssODN1 from the 'target strand' favored Cas9-mediated HDR, while the ssODN2 from the 'non-target strand' favored Cas12a-mediated HDR. This result is consistent with previously published data (1,2). Error bars correspond to the standard deviation of the average of n = 3 parallel samples. The experiment was repeated three times and a representative dataset is presented here.

Supplementary figure 6 FACS and NGS analyses for LAHR in repairing the 'A200C' mutation
The 'A200C' mutation that turned EGFP off in the EGFP Y66S reporter cells is indicated as a red C/G pair.
LAHR templates I to VIII contained base substitutions (blue base pairs) to introduce silent mutations.
FACS data (green bars) show the percentages of EGFP-positive cells after LAHR. The gray bars indicate the repair efficiency of the A200C mutation using the indicated templates. Error bars correspond to the standard deviation of the average of n = 3 parallel samples.

Supplementary figure 7 Improving LAHR efficiency by PAM sequence or seed region disruption
(A) In this study, we used the EGFP Y66S reporter cell line described above. The targeting locus is indicated as a stretch of gray-shadowed dsDNA, and the A200C mutation is shown as a red C/G pair.
The AsCas12a PAM sequence is indicated by a green arrow on top, and the seed region is underlined in green. To examine the effects of PAM sequence or seed region disruption on LAHR editing efficiency, we designed different LAHR templates containing silent mutations to destroy the PAM sequence (Mut 1), the seed region (Mut 2) or both (Mut 3), as indicated. The 'Control 1' template was a LAHR template with unchanged PAM and seed sequences. In addition, for comparison, we included ssODN templates containing the same mutations as those in the LAHR templates: 'Mut 4' with a mutated PAM, 'Mut 5' with a mutated seed and 'Mut 6' with both. The 'Control 2' template was an

Supplementary figure 8 Delivering Cas12a RNP and LAHR template with electroporation
To examine the applicability of LAHR with a non-iTOP delivery method, we electroporated AsCas12a RNP and LAHR template into the single-copy EGFP Y66S cells. We tested the LAHR efficiency in two quantity setups, 50 pmol and 500 pmol, with a 1:1 molar ratio between components. The graph depicts the percentage of EGFP-positive cells from the FACS analysis. Error bars correspond to the standard deviation of the average of n = 3 parallel samples. The experiment was repeated three times and a representative dataset is presented here.

... in an siRNA-free control transfection sample. Error bars correspond to the standard deviation of the average of 3 biological replicate groups, in each of which 3 technical replicates were included. Statistical test: two-tailed unpaired t-test, ns P > 0.05, * P < 0.05, ** P < 0.01, *** P < 0.001.

(B) Experimental setup of LAHR under siRNA knockdown conditions
Schematic representation of the experimental setup. In brief, siRNA transfections (using selected siRNAs from (A)) were performed on day 0. Two days later, each siRNA-transfected sample was split into two plates. One group of plates was prepared for iTOP transduction on day 4, followed by FACS analysis on day 6 to determine the repair efficiencies under different knockdown conditions. Cells on the other group of plates were cultured until day 6 for RNA isolation and qPCR analysis to examine the knockdown efficacies of the target genes at the time point when LAHR was applied.

(C) Confirmation of the siRNA-knockdown efficacy by qPCR
Analysis of the knockdown efficacies of target genes at the time point of LAHR editing by qPCR. The bar graphs show the gene expression levels. 'WT' indicates the expression level of the target gene in an siRNA-free control transfection sample. Error bars correspond to the standard deviation of the average of 3 technical replicates. Statistical test: two-tailed unpaired t-test, * P < 0.05, *** P < 0.001.
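The legend does not specify how relative expression was calculated from the qPCR data. A minimal sketch of the commonly used 2^(−ΔΔCt) calculation is shown below as one plausible approach. It assumes normalization to a reference gene and roughly 100% amplification efficiency. The function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target_kd, ct_ref_kd, ct_target_wt, ct_ref_wt):
    """Relative expression of a target gene (knockdown vs. WT) by 2^(-ddCt).

    ct_*_kd: Ct values in the siRNA-treated sample
    ct_*_wt: Ct values in the siRNA-free control ('WT') sample
    Assumes a reference gene for normalization and ~100% primer efficiency.
    """
    delta_kd = ct_target_kd - ct_ref_kd   # dCt in the knockdown sample
    delta_wt = ct_target_wt - ct_ref_wt   # dCt in the WT control
    return 2.0 ** -(delta_kd - delta_wt)


# Hypothetical Ct values: the target appears 2 cycles later after knockdown
remaining = relative_expression(26.0, 18.0, 24.0, 18.0)
print(f"remaining expression: {remaining:.2f} of WT")
print(f"knockdown efficacy: {100 * (1 - remaining):.0f}%")
```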

Supplementary figure 11 Repairing the EGFP Y66S mutant by LAHR following a classic-MMEJ pathway
(A) Schematic representation of a LAHR template containing a 3' homologous overhang, which forces the ligation step of LAHR to follow a classic MMEJ pathway. The targeting locus is presented as a stretch of dsDNA containing a mutation (red dots on both strands), the AsCas12a cut site is indicated by purple arrowheads and a homologous region is indicated with brown blocks. The LAHR template (with a 3' overhang) contains a base substitution (green dots on both strands) and a 3' homologous overhang (a brown block). After AsCas12a cleavage, the homologous region at the editing locus remains double-stranded and can pair with the 3' homologous overhang on the repair template only if a compatible 3' homologous overhang is created by resection. The classic MMEJ pathway is shown in the green box.
(B) A LAHR template containing a 3' homologous overhang was used to repair the EGFP Y66S mutation. A standard LAHR template (containing a 5' overhang) was used as control. The bar chart compares the repair efficiencies as the percentage of EGFP-positive cells determined by flow cytometry analysis. Error bars correspond to the standard deviation of the average of n = 3 parallel samples. The experiment was repeated three times and a representative dataset is presented here.